Euclidean Bus Mobility and Route Optimization, A Comparison

Routes in Queens, New York City, NY

Author

Alan Vlakancic

Published

November 25, 2025

Introduction

This project will use stplanr transport modeling package to design an optimal transport route for bus or cycle routes in New York City. Stplanr is a transport planning visualization R package that can be used to plan transit networks, in addition to transit planning elements such as identifying transit catchment areas, origin/destination data and ride frequency visualizations, among others. In their white paper, the devlopers of stplanr call for an accountable, transparent and democratized transit planning system that doesn’t rely on proprietary and often vastly different data sources and data processing softwares. Although the package can visualize a whole host of data, this project will focus on comparing direct desire lines or “Euclidean” routes (as the crow flies), existing bus networks and stplanr’s optimized routes. To wit: this can map the efficiency of the bus routes compared to the most direct route possible if there were no built environment factors in the way.

Methods

To adequately compare desire lines, bus routes, and the most efficient routes with the current street network, this project will require at minimum three data sources. Each of these will be sourced separately and overlaid onto each other:

  • Data Source: ggmap data for basemap
    • Methods: This can be sourced directly into R by installing the package. ggmap can bring in data from Stadiamaps. This will require a Google API key to be registered.
    • Source: Getting Started with ggmap & Stadiamaps
  • Data Source: OpenStreetMap data for transit data, either bus or cycle routes
    • Methods: This can be sourced directly into R by installing the package. You may need to rationalize different projection systems to make sure they overlay correctly.
    • Source: Mapping with OpenStreetMap in R
  • Data Source: NYC Open Data for Bus Shelter locations
    Despite significant searching, there is no comprehensive bus stop dataset, so the project will focus on bus stop shelters, which are mapped via NYC Open Data.
    • Methods: This can be brought into R as a CSV file. Each bus stop shelter has longitude and latitude coordinates that align with ggmap and OpenStreetMap projections.
    • Source: NYC Bus Stop Shelters

Workflow:

Show code
library(ggmap)
library(stplanr)
library(osmdata)
library(sf)
library(tidyverse)
library(leaflet)
library(purrr)

bbox <- c(left = -73.96, bottom = 40.54, right = -73.70, top = 40.81)
#create bounding box for queens

nyc_map <- "data/"

#nyc basemap, downloaded from nyc open data. source: https://search.r-project.org/CRAN/refmans/ptools/html/nyc_bor.html

nyc_sf <- st_read(nyc_map)
Reading layer `nybb' from data source 
  `C:\Users\vlakanah\OneDrive - Alfred State College\01. Semesters\14. 2025 FA\GEO511\GEO511_FinalProject\data' 
  using driver `ESRI Shapefile'
Simple feature collection with 5 features and 4 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 913175.1 ymin: 120121.9 xmax: 1067383 ymax: 272844.3
Projected CRS: NAD83 / New York Long Island (ftUS)
Show code
#bring in nyc_map as a sf


shelters_sf <- read_csv("data/Bus_Stop_Shelter_20251020.csv")
#NOTE: REPLACE WITH DATA WHEN USING QUARTO!
#this brings in the bus stop shelter information. source: https://data.cityofnewyork.us/Transportation/Bus-Stop-Shelters/qafz-7myz

shelters_sf_fix <- st_as_sf(shelters_sf, coords = c("Longitude","Latitude"), crs = 4326)
#convert to sf object with the correct coordinate reference system

osmdata::set_overpass_url("https://overpass-api.de/api/interpreter")

osm_data <- opq(bbox = bbox) %>%
  add_osm_feature(key = "highway", value = c("primary","secondary")) %>%
  osmdata_sf()
#import data for primary and secondary highways from open street maps

shelters_sf_fix <- shelters_sf_fix %>%
  mutate(id = paste0("S", row_number()))
#add ID column for origin-destination pairs so they have a corresponding number

flow_all <- expand.grid(o = shelters_sf_fix$id, #create origin
                        d = shelters_sf_fix$id, #create destination
                        stringsAsFactors = FALSE) %>% #make sure they aren't factors
  filter(o != d) %>% # this remove self-pairs so O is not D
  mutate(trips = 1) %>% #add trip count of 1 for each pair
  sample_n(50) #sample 50 random paris to avoid blowing up the computer

desire_lines_all <- od2line(flow_all, zones = shelters_sf_fix, zone_code = "id") #use od2line function to create desire lines (euclidean) for all pairs

shelter_coords <- shelters_sf_fix %>%
  st_coordinates() %>%
  as.data.frame() %>%
  bind_cols(id = shelters_sf_fix$id)
#extract coordinates and bind with ID column

route_single <- function(o_id, d_id) { #function to create a single route between origin and destination
  o <- shelter_coords %>% filter(id == o_id) 
  #filter to get origin coordinates
  d <- shelter_coords %>% filter(id == d_id)
  #filter to get destination coordinates

  r <- try(route_osrm(from = c(o$X, o$Y),
                      to   = c(d$X, d$Y)), silent = TRUE)
#use try to catch errors  (e.g., no route found)
  if (inherits(r, "try-error")) return(NULL)
#if route found, return the route
  return(r)
}

routes_list <- purrr::map2(flow_all$o, flow_all$d, route_single)
#create routes for all origin-destination pairs using the route_single function
routes_list <- routes_list[!sapply(routes_list, is.null)]
#remove any NULL results (failed routes)
routes_sf <- do.call(rbind, routes_list)
#combine all routes into a single sf object


routes_projection <- st_transform(routes_sf, 32618)
desire_projection <- st_transform(desire_lines_all, 32618)
#ensures the correct, projected shapefile for computation not mapping

route_length <- st_length(routes_projection)
desire_length <- st_length(desire_projection)
#compute lengths

route_length <- as.numeric(route_length)
desire_length <- as.numeric(desire_length)
#convert lengths to numeric values

lengths_tbl <- tibble(
  route_m  = route_length,
  desire_m = desire_length,
  origin = flow_all$o,
  destination = flow_all$d
)
#create tibble to compare lengths in the final map w/ IDs

mean_route <- mean(route_length, na.rm = TRUE)
mean_desire <- mean(desire_length, na.rm = TRUE)
#calculate means for both route and desire lengths

percent_change <- ((mean_route - mean_desire) / mean_desire) * 100
#calculate percent change 

mean_lengths <- data.frame(
  type = c("Route Length", "Desire Line Length"),
  mean_length_m = c(mean_route, mean_desire)
)

# Add percent change (only for Route, NA for Desire)
mean_lengths <- mean_lengths %>%
  mutate(
    mean_length_km = mean_length_m / 1000,
    percent_change = c(percent_change, NA)
  )


nyc_leaflet  <- st_transform(nyc_sf, 4326)
roads_leaflet <- st_transform(osm_data$osm_lines, 4326)
desire_leaflet <- st_transform(desire_lines_all, 4326)
routes_leaflet <- st_transform(routes_sf, 4326)

desire_leaflet_popup <- paste0(
  "<b>Desire Line</b><br/>",
  "Origin: ", desire_leaflet$o, "<br/>",
  "Destination: ", desire_leaflet$d, "<br/>",
  "Desire Line Distance: ", round(lengths_tbl$desire_m / 1000), " km"
)
  
routes_leaflet_popup <- paste0(
  "<b>OSRM Route</b><br/>",
  "Origin: ", desire_leaflet$o, "<br/>",
  "Destination: ", desire_leaflet$d, "<br/>",
  "Route Distance: ", round(lengths_tbl$route_m / 1000), " km<br/>"
)

pal_desire <- colorNumeric(
  palette = "inferno",
  domain  = lengths_tbl$desire_m
)

pal_routes <- colorNumeric(
  palette = "inferno",
  domain  = lengths_tbl$route_m
)

selected_ids <- unique(c(flow_all$o, flow_all$d))
#get unique IDs of sampled shelters

selected_shelters <- shelters_sf_fix %>% 
  filter(id %in% selected_ids)
#filter shelters to only those that were sampled

leaflet() %>%
  addProviderTiles("CartoDB.Positron") %>%
  
  addPolygons(data = nyc_leaflet,
              color = "black", weight = 3,
              fillOpacity = 0.1,
              group = "NYC Boundary") %>%
  
  # Desire lines with gradient colors
  addPolylines(
    data = desire_leaflet,
    color = pal_desire(lengths_tbl$desire_m),
    weight = 3,
    opacity = 0.7,
    popup = desire_leaflet_popup,
    group = "Desire Lines"
  ) %>%
  
  # OSRM routes with gradient colors
  addPolylines(
    data = routes_leaflet,
    color = pal_routes(lengths_tbl$route_m),
    weight = 3,
    opacity = 0.8,
    popup = routes_leaflet_popup,
    group = "OSRM Routes"
  ) %>%
  
  # Add color legends so users know what the gradient means
  addLegend(
    pal = pal_desire,
    values = lengths_tbl$desire_m,
    title = "Desire Line Distance (m)",
    position = "bottomright",
    group = "Desire Lines"
  ) %>%
  
  addLegend(
    pal = pal_routes,
    values = lengths_tbl$route_m,
    title = "OSRM Route Distance (m)",
    position = "bottomleft",
    group = "OSRM Routes"
  ) %>%
  
  # Add shelters (same as before)
  addCircleMarkers(data = shelters_sf_fix,
                   color = "blue",
                   radius = 1,
                   popup = ~id,
                   group = "Bus Shelters") %>%
  
  addCircleMarkers(
    data = selected_shelters,
    color = "purple",
    radius = 6,
    fillOpacity = 0.9,
    group = "Sampled Shelters"
  ) %>%
  
  addLayersControl(
    overlayGroups = c("NYC Boundary", "Sampled Shelters",
                      "Desire Lines", "OSRM Routes",
                      "Bus Shelters"),
    options = layersControlOptions(collapsed = FALSE)
  )